Open RStudio and start a new script or continue the one from last week
Set your working directory to where you downloaded and unzipped the files
Follow along by opening r-geospatial-workshop-pt2.html in a web browser
Make sure required libraries are installed.
Recap Part I
Map spatial objects with tmap
Taste of Spatial Analysis
In Part I, we:
ggplot and ggmapggmapSpatialPointsDataFrames with sp::coordinatesrgdal::readOGRCRSs with sp::proj4string and sp::CRSsp::spTransformsfboundary_lonlat as a shapefile using rgdal::writeOGRLet’s load the libraries we will use
library(sp) # spatial objects and methods
library(rgdal) # read/write from file; manage CRSs
library(rgeos) # geometric operations
library(tmap) # mapping spatial objects
## Warning: package 'sp' was built under R version 3.4.3
## Warning: package 'rgdal' was built under R version 3.4.3
## rgdal: version: 1.2-16, (SVN revision 701)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 2.1.3, released 2017/20/01
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/gdal
## GDAL binary built with GEOS: FALSE
## Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
## Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/proj
## Linking to sp version: 1.2-5
## Warning: package 'rgeos' was built under R version 3.4.2
## rgeos version: 0.3-26, (SVN revision 560)
## GEOS runtime version: 3.6.1-CAPI-1.10.1 r0
## Linking to sp version: 1.2-5
## Polygon checking: TRUE
## Warning: package 'tmap' was built under R version 3.4.3
Use setwd to set your working directory to the location of the tutorial files.
For example:
setwd("~/Documents/Dlab/workshops/2018/rgeo/r-geospatial-workshop/r-geospatial-workshop")
# Read in from CSV file
sfhomes <- read.csv('data/sf_properties_25ksample.csv',
stringsAsFactors = FALSE)
# subset the data
sfhomes15 <- subset(sfhomes, as.numeric(SalesYear) == 2015)
sfhomes15_sp <- sfhomes15 # Make a copy
# Make it spatial
coordinates(sfhomes15_sp) <- c('lon','lat')
# Read in the Bart data from CSV file
bart <- read.csv("./data/bart.csv", stringsAsFactors = F)
Read in the sf_boundary.shp file with rGDAL
sfboundary <- readOGR(dsn="data", layer="sf_boundary")
## OGR data source with driver: ESRI Shapefile
## Source: "data", layer: "sf_boundary"
## with 1 features
## It has 3 fields
*Also called defining and reprojecting
# Define the CRS
proj4string(sfhomes15_sp) <- CRS("+init=epsg:4326")
# Transform CRS
sfboundary_lonlat <- spTransform(sfboundary, CRS("+init=epsg:4326"))
# Read in the `sf_highways.shp` file with `rGDAL`
highways <- readOGR(dsn="data", layer="sf_highways")
## OGR data source with driver: ESRI Shapefile
## Source: "data", layer: "sf_highways"
## with 246 features
## It has 5 fields
# Transform to geographic coords
highways_lonlat <- spTransform(highways, CRS("+init=epsg:4326"))
base::plot, ggplot, ggmap for geospatial data in data frames
sp::plot for sp objects
Use it to create quick maps
Can be used to create great maps
See examples in sp Gallery: Plotting maps with sp
tmap stands for thematic map
It’s a powerful toolkit for creating maps with sp objects
yet, with less code than the alternatives
Syntax should be familar to ggplot2 users, but simpler
Relatively easy to create & save interactive maps
Load the library
library(tmap)
You may need to install /load dependencies - ggplot2, RColorbrewer, classInt, leaflet libraries
qtm(sfboundary)
Similar to the plot()
Now, make a qtm of sfhomes15_sp
qtm(sfboundary)
tmap has 2 modes:
plot for static maps
view for interactive maps
Use the command tmap_mode to toggle between them:
tmap_mode("plot")
tmap_mode("view")
Create an interactive tmap by setting the mode to view.
Then recreate the qtm of sfhomes15_sp
tmap_mode("view") # set tmap to interactive view mode
qtm(sfhomes15_sp) # Interactive - click on the points
Check the Popups and Layer selector
## tmap mode set to interactive viewing
tmap_mode("plot")
## tmap mode set to plotting
To customize your tmap you need to use the more complex syntax
tm_shape(sf_boundary) +
tm_polygons(col="beige", border.col="black")
tm_shape(<sp_object>) specifies the sp object
+ tm_<element>(...) specifies the symbology
plus other options for creating a publication ready map
tm_shape(sfboundary) +
tm_polygons(col="beige", border.col="black")
tm_dotstm_dots are a type of tm_symbols()
See ?tm_symbols for other types of point symbols.
tm_shape(sfhomes15_sp) +
tm_dots(col="red", size=.25)
tm_dotstm_shape(sfhomes15_sp) +
tm_dots(col="red", size=.25)
tm_linestm_shape(highways_lonlat) +
tm_lines(col="black")
Column names must be quoted!
tm_shape(sfhomes15_sp) +
tm_dots(col="totvalue", size=.25)
You can overlay multiple sp objects, or layers, on the map, by concatenating the tmap commands with plus signs
# Map the SF Boundary first
tm_shape(sfboundary_lonlat) +
tm_polygons(col="beige", border.col="black") +
# Overlay the highway lines next
tm_shape(highways_lonlat) +
tm_lines(col="black") +
# Then add the house points
tm_shape(sfhomes15_sp) +
tm_dots(col="totvalue", size=.25)
Try the code below
Can you redo the previous map with sfboundary instead of sfboundary_lonlat?
If yes, what does that tell you about tmap?
tm_shape(sfboundary) +
tm_polygons(col="beige", border.col="black") +
tm_shape(highways_lonlat) +
tm_lines(col="black") +
tm_shape(sfhomes15_sp) +
tm_dots(col="totvalue", size=.25)
If the CRSs are defined, tmap will use that info - if not, tmap will assume WGS84 -
and dynamically reproject subsequent layers to match first one added to map
tm_shape(sfboundary_lonlat) +
tm_polygons(col="beige", border.col="black") +
tm_shape(highways_lonlat) +
tm_lines(col="black") +
tm_shape(sfhomes15_sp) +
tm_dots(col="totvalue", size=.25,
title = "San Francisco Property Values (2015)") +
tm_layout(inner.margins=c(.05, .2, .15, .05))
# bottom, left, top, right
Make the previous tmap interactive
tmap_mode("view")
tm_shape(sfboundary_lonlat) +
tm_polygons(col="beige", border.col="black") +
tm_shape(highways_lonlat) +
tm_lines(col="black") +
tm_shape(sfhomes15_sp) +
tm_dots(col="totvalue", size=.25, title = "San Francisco Property Values (2015)") +
tm_layout(inner.margins=c(.05, .2, .15, .05)) # bottom, left, top, right
# OR
#ttm() # Toggle Tmap mode between "interactive" and "plot"
#last_map() # display the last map
## tmap mode set to interactive viewing
What changed in the code? Output map?
tm_shape(sfboundary_lonlat) +
tm_polygons(col="beige", border.col="black") +
tm_shape(highways_lonlat) +
tm_lines(col="black") +
tm_shape(sfhomes15_sp) +
tm_dots(col="totvalue", size=.25,
title = "San Francisco Property Values (2015)",
popup.vars=c("SalesYear","totvalue","NumBedrooms",
"NumBathrooms","AreaSquareFeet")) +
tm_layout(inner.margins=c(.05, .2, .15, .05)) # bottom, left, top, right
Save the map to a named variable (map1)
map1 <- last_map()
map1 # then display it
Use save_tmap and then take a look at your files
save_tmap(map1, "sf_properties.png", height=6) # Static image file with
save_tmap(map1, "sf_properties.html") # interactive web map
## Map saved to /Users/pattyf/Documents/Dlab/workshops/2018/rgeo_may2018/r-geospatial-workshop/sf_properties.png
## Resolution: 2173.828 by 1800 pixels
## Size: 7.246095 by 6 inches (300 dpi)
## Interactive map saved to /Users/pattyf/Documents/Dlab/workshops/2018/rgeo_may2018/r-geospatial-workshop/sf_properties.html
?tmap
?tmap_shape
?tmap_element
?tm_symbols (tm_dots, etc…)
Many ways to do this.
You can share you map online by publishing it to RPubs.
RPubs account to make that work.Enter the name of your tmap object (map1) or your tmap code in the console
In the Viewer window, click on the Publish icon.
In order to perform spatial analysis we need to first convert all data objects to a common CRS.
Which type? Projected or Geographic CRS?
If my goal is to create maps, I may convert all data to a geographic CRS.
If my goal is to do spatial analysis, I will convert to a projected CRS.
Geographic CRSs
4326 Geographic, WGS84 (default for lon/lat)
4269 Geographic, NAD83 (USA Fed agencies like Census)
Projected CRSs
5070 USA Contiguous Albers Equal Area Conic
3310 CA ALbers Equal Area
26910 UTM Zone 10, NAD83 (Northern Cal)
3857 Web Mercator (web maps)
Use spTransform to transform sfhomes15_sp and bart to UTM 10N, NAD83
highways and sfboundary already have this CRSRecall, this transformation is called projecting or reprojecting
First, transform sfhomes15_sp
Note the two methods for doing same thing.
sfhomes15_utm <- spTransform(sfhomes15_sp, CRS("+init=epsg:26910"))
# OR
# sfhomes15_utm <- spTransform(bart, CRS(proj4string(sfboundary)))
The bart data is still a data.frame object.
So it needs to be
Try it….
coordinates(bart) <- c("X","Y")
proj4string(bart) <- CRS("+init=epsg:4326")
bart_utm <- spTransform(bart, CRS(proj4string(sfboundary)))
Do the CRSs all match?
proj4string(bart_utm)
## [1] "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
proj4string(sfboundary)
## [1] "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
proj4string(highways)
## [1] "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
proj4string(sfhomes15_utm)
## [1] "+init=epsg:26910 +proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
Use the CRS of an existing sp object if you want to get them to match exactly!
#sfhomes15_utm <- spTransform(sfhomes15_sp, CRS("+init=epsg:26910"))
sfhomes15_utm <- spTransform(sfhomes15_sp, CRS(proj4string(sfboundary)))
# check
proj4string(sfhomes15_utm)
## [1] "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
proj4string(sfboundary)
## [1] "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
Visual check
plot(sfboundary)
lines(highways, col='purple', lwd=4)
points(sfhomes15_utm)
plot(bart_utm, col="red", pch=15, add=T)
Mapping / plotting to see location and distribution
Asking questions of, or querying, your data
Cleaning & reshaping the data
Applying analysis methods
Mapping analysis results
Repeat as needed
There are two key types of spatial queries
These types are often combined, e.g.
rgeosrgeos is THE package for manipulating geometric objects - or the geometry component of an sp object.
We will use it to compute things like area and distance.
We will also use it to compare and intersect geometries and compute new geometries.
For detailed documentation see, ??rgeos
What is the area of San Francisco?
What data would we use to answer that question?
Use rgeos::gArea to compute the area of SpatialPolygon objects
What are the units?
Check results against Wikipedia for SF
gArea(sfboundary)
## [1] 119949901
proj4string(sfboundary) # to get the CRS units
## [1] "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
gArea(sfboundary) / (1000 * 1000) # Convert to square KM
## [1] 119.9499
Use the rgeos function gLength to compute length of linear spatial objects
gLength(highways) / 1000 # in kilometers
## [1] 39.83624
The rgeos function gDistance will return the min distance between two geometries.
Compute the distance in kilometers between Embarcadero & Powell St Bart stations
gDistance(bart_utm[bart_utm$STATION == 'EMBARCADERO',],
bart_utm[bart_utm$STATION == 'POWELL STREET',]) /1000
## [1] 1.334997
Get distance between all sfhomes and Embarcadero BART
dist2emb<- gDistance(bart_utm[bart_utm$STATION == 'EMBARCADERO',],
sfhomes15_utm, byid=TRUE) /1000
# check output
nrow(dist2emb)
## [1] 835
nrow(sfhomes15_utm)
## [1] 835
head(dist2emb)
## 30
## 24 9.356050
## 35 3.775303
## 76 2.381774
## 79 11.130542
## 85 4.409883
## 90 1.660183
How is the output different if byid=FALSE (default)
?gDistance
#dist2emb<- gDistance(bart_utm[bart_utm$STATION == 'EMBARCADERO',],
# sfhomes15_utm, byid=TRUE) /1000
gDistance(bart_utm[bart_utm$STATION == 'EMBARCADERO',],
sfhomes15_utm) /1000
Spatial relationship queries compare the geometries of two spatial objects in the same coordinate space (CRS).
There are many, often similar, functions to perform these queries (can be confusing!).
These operations may return logical values, lists, matrices, dataframes, geometries or spatial objects
you need to check what type of object is returned
you need to check what values are returned to make sure they make sense
This is a very common type of spatial query called a point-in-polygon query.
We can use the rgeos function gIntersects to answer this.
We already know the answer but we want to see how it is done.
What is the output of gIntersects?
bart_stations_in_sf <-gIntersects(bart_utm, sfboundary)
# bart_stations_in_sf
What is the output of gIntersects?
bart_stations_in_sf <-gIntersects(bart_utm, sfboundary)
bart_stations_in_sf
## [1] TRUE
Same function, but byid=TRUE What is the output of gIntersects this time?
sfbart_stations <-gIntersects(bart_utm, sfboundary, byid=TRUE)
# class(sfbart_stations)
# sfbart_stations
What is the output of gIntersects?
sfbart_stations <-gIntersects(bart_utm, sfboundary, byid=TRUE)
class(sfbart_stations)
## [1] "matrix"
sfbart_stations
## 1 2 3 4 5 6 7 8 9 10 11 12
## 0 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 13 14 15 16 17 18 19 20 21 22 23 24
## 0 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 25 26 27 28 29 30 31 32 33 34 35 36 37
## 0 FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## 38 39 40 41 42 43 44
## 0 FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Why is it different?
We can use the result of gIntersects to view the names of the bart stations
bart_utm[as.vector(sfbart_stations),]$STATION
## [1] "EMBARCADERO" "MONTGOMERY STREET"
## [3] "POWELL STREET" "CIVIC CENTER/ UN PLAZA"
## [5] "16TH STREET & MISSION" "24TH STREET & MISSION"
## [7] "GLEN PARK" "BALBOA PARK"
We can use the result of gIntersects to subset bart_utm
sfbart_utm <- bart_utm[as.vector(sfbart_stations),]
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(sfboundary) +
tm_polygons(col="beige", border.col="black") +
tm_shape(sfbart_utm) +
tm_dots(col="red")
tmap to plot modetmap_mode("plot")
## tmap mode set to plotting
rgeos::gIntersects(geom1, geom2) returns
byid=FALSE (the default when not specified)byid=TRUELet’s consider the sfhomes15_utm data along with SF Census tract data
Use readOGR to load the shapefile sftracts_wpop.shp
data folderExplore the data
sp object is it?sftracts <- readOGR(dsn="./data", layer="sftracts_wpop")
## OGR data source with driver: ESRI Shapefile
## Source: "./data", layer: "sftracts_wpop"
## with 195 features
## It has 10 fields
class(sftracts)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
proj4string(sftracts)
## [1] "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"
head(sftracts@data)
## STATEFP COUNTYFP TRACTCE AFFGEOID GEOID NAME LSAD
## 0 06 075 010700 1400000US06075010700 06075010700 107 CT
## 1 06 075 012201 1400000US06075012201 06075012201 122.01 CT
## 2 06 075 013102 1400000US06075013102 06075013102 131.02 CT
## 3 06 075 013500 1400000US06075013500 06075013500 135 CT
## 4 06 075 015500 1400000US06075015500 06075015500 155 CT
## 5 06 075 016300 1400000US06075016300 06075016300 163 CT
## ALAND AWATER pop14
## 0 183170 0 5311
## 1 92048 0 4576
## 2 237886 0 2692
## 3 235184 0 2592
## 4 311339 0 3918
## 5 245867 0 4748
plot(sftracts)
A spatial join associates rows of data in one object with rows in another object based on the spatial relationship between the two objects.
A spatial join is based on the comparison of two sets of geometries in the same coordinate space.
The default spatial relationship is intersects
We need to spatially join the sftracts and sfhomes15_utm to answer this.
What spatial object are we joining data from? to?
Spatial overlay operations in R are implemented using over
Simple point-in-polygon operations use sp::over
More complex geometric comparisons use rgeos::over
You need to have rgeos package installed!
over(x,y)over(x,y)
over(x,y, returnList=TRUE) as:
See ?over for details.
In what census tract is each SF property located?
homes_with_tracts <- over(sfhomes15_utm, sftracts)
If not, why not?
The over function, like almost all spatial analysis functions, requires that both data sets be spatial objects (they are) with the same coordinate reference system (CRS). Let’s investigate
# What is the CRS of the property data?
proj4string(sfhomes15_utm)
# What is the CRS of the census tracts?
proj4string(sftracts)
# Transform the CRS for tracts to be the same as that for sfhomes15_sp
sftracts_utm <- spTransform(sftracts, CRS(proj4string(sfhomes15_utm)))
# make sure the CRSs are the same
proj4string(sftracts_utm) == proj4string(sfhomes15_utm)
## [1] TRUE
Now let’s try that overlay operation again
In what tract is each SF property is located?
homes_with_tracts <- over(sfhomes15_utm, sftracts_utm)
over outputWhat is our output? Does it answer our question?
What type of data object did the over function return?
Do we need to add returnList = TRUE
homes_with_tracts <- over(sfhomes15_utm, sftracts_utm)
class(homes_with_tracts)
nrow(homes_with_tracts)
nrow(sftracts_utm)
nrow(sfhomes15_utm)
What do we have here?
homes_with_tracts <- over(sfhomes15_utm, sftracts_utm)
class(homes_with_tracts)
## [1] "data.frame"
nrow(homes_with_tracts)
## [1] 835
nrow(sftracts_utm)
## [1] 195
nrow(sfhomes15_utm)
## [1] 835
head(homes_with_tracts)
## STATEFP COUNTYFP TRACTCE AFFGEOID GEOID NAME LSAD
## 24 06 075 030800 1400000US06075030800 06075030800 308 CT
## 35 06 075 061400 1400000US06075061400 06075061400 614 CT
## 76 06 075 060700 1400000US06075060700 06075060700 607 CT
## 79 06 075 033203 1400000US06075033203 06075033203 332.03 CT
## 85 06 075 020800 1400000US06075020800 06075020800 208 CT
## 90 06 075 011200 1400000US06075011200 06075011200 112 CT
## ALAND AWATER pop14
## 24 1243203 0 5646
## 35 797747 0 5625
## 76 2082513 844390 9804
## 79 437672 0 3931
## 85 346098 0 6182
## 90 177415 0 3078
over discussionOur output homes_with_tracts is a data frame that contains
sfhomes15_utm - in same ordersfhomes15_utmsftracts_utm@data including the census tract id (GEOID)So we are close to answering our question.
But for the data to be useful we need - to add the GEOID column in homes_with_tracts to sfhomes15_utm
CAUTION: this only works because the data are in the same order!
sfhomes15_utm$home_geoid <- homes_with_tracts$GEOID
# Review and note the change
#head(sfhomes15_utm@data)
Data linkage via space!
The over operation gave us the census tract data info for each point in sfhomes15_utm
We added the GEOID for each point to the @data slot of sfhomes15_utm
We can now join sfhomes15_utm points by GEOID to any census variable, eg median household income, and then do an analysis of the relationship between, for example, property value and that variable.
How would we do that?
The sf_med_hh_income2015.csv file only has two columns: GEOID and medhhinc.
Because GEOIDs can have leading zeros, we set the colClasses to make sure they are not stripped.
med_hh_inc <- read.csv("data/sf_med_hh_income2015.csv", stringsAsFactors = F,
colClasses = c("character","numeric"))
head(med_hh_inc)
## GEOID medhhinc
## 1 06075980401 0
## 2 06075990100 0
## 3 06075012502 11925
## 4 06075012301 13909
## 5 06075061100 16545
## 6 06075980501 16638
We can use sp::merge to join the med_hh_inc DF to the sfhomes15_utm SPDF.
We should make sure that they share a column of common values - GEOID / home_geoid
We use sp:merge not regular merge to maintain the integrity of the sp object
Join two data objects based on common values in a column.
Use sp:merge to join a data frame to a spatial object with a data frames (S*DF objects)
sfhomes15_utm <- merge(sfhomes15_utm,
med_hh_inc, by.x="home_geoid", by.y="GEOID")
IMPORTANT: DO NOT merge a data frame to the @data slot!
head(sfhomes15_utm@data, 2) # take a look with View(sfhomes15_utm@data)
## home_geoid FiscalYear SalesDate Address
## 549 06075030800 2015 2015-08-21 0000 2760 19TH AV0015
## 758 06075061400 2015 2015-08-13 0000 0560AMISSOURI ST0000
## YearBuilt NumBedrooms NumBathrooms NumRooms NumStories NumUnits
## 549 1979 2 2 5 0 1
## 758 2003 2 2 5 1 1
## AreaSquareFeet ImprovementValue LandValue Neighborhood
## 549 1595 432500 432500 West of Twin Peaks
## 758 1191 701280 701280 Potrero Hill
## Location SupeDistrict totvalue SalesYear
## 549 (37.7360097396496, -122.474067310226) 7 865000 2015
## 758 (37.759197817252, -122.396516184449) 10 1402560 2015
## medhhinc
## 549 131699
## 758 138897
sp::merge resultstmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(sfhomes15_utm) + tm_dots(col="medhhinc")
tmap_mode("plot")
## tmap mode set to plotting
We now know the tract for each property.
Now let’s think about this question from the tract perspective.
Let’s ask the question
Since we joined GEOID to each property we can use the non-spatial aggregate function to compute the mean of totvalues for each GEOID.
But let’s use the sp aggregate function instead!
It’s actually more straight forward.
Aggregate data values in one spatial object by the geometry of another using the specified function.
What is the mean home value in each census tract?
tracts_with_mean_val <- aggregate(x = sfhomes15_utm["totvalue"],
by = sftracts_utm, FUN = mean)
Wow, so simple. What does that give us?
sp::aggregateclass(tracts_with_mean_val)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
head(tracts_with_mean_val@data)
## totvalue
## 0 NA
## 1 482000.0
## 2 702000.0
## 3 2825000.0
## 4 800983.3
## 5 1946666.7
nrow(tracts_with_mean_val) == nrow(sftracts_utm)
## [1] TRUE
sp::aggregate returned a SpatialPolygonsDataFrame
The SPDF has the same geometry as sftracts_utm
But the tracts_with_mean_val@data slot only contains the mean totvalue for each tract.
To make these data more useful, let’s add this value to sftracts_utm
This only works because their are the same number of elements both @data slots and they are in the same order!
sftracts_utm$mean_totvalue <- tracts_with_mean_val$totvalue
head(sftracts_utm@data) # check it
## STATEFP COUNTYFP TRACTCE AFFGEOID GEOID NAME LSAD
## 0 06 075 010700 1400000US06075010700 06075010700 107 CT
## 1 06 075 012201 1400000US06075012201 06075012201 122.01 CT
## 2 06 075 013102 1400000US06075013102 06075013102 131.02 CT
## 3 06 075 013500 1400000US06075013500 06075013500 135 CT
## 4 06 075 015500 1400000US06075015500 06075015500 155 CT
## 5 06 075 016300 1400000US06075016300 06075016300 163 CT
## ALAND AWATER pop14 mean_totvalue
## 0 183170 0 5311 NA
## 1 92048 0 4576 482000.0
## 2 237886 0 2692 702000.0
## 3 235184 0 2592 2825000.0
## 4 311339 0 3918 800983.3
## 5 245867 0 4748 1946666.7
Make a map of the results to make sure they seem reasonable.
First, set tmap to interactive mode
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(sftracts_utm) +
tm_polygons(col="mean_totvalue", border.col=NA)
tm_shape(sftracts_utm) +
tm_polygons(col="mean_totvalue", border.col=NA) +
tm_shape(sfhomes15_utm) + tm_dots()
Many methods of spatial analysis are based on distance queries.
For example, point pattern analysis considers the distance between features to determine whether or not they are clustered.
We can also use distance as a way to select features spatially.
What properties are within walking distance of BART?
In order to select properties with 1KM of BART
We then do a point-in-polygon operation to either count the number of properties within the buffer or compute the mean totvalue.
rgeos is the muscle for
We can use the rgeos::gBuffer function to create our buffer polygon
The rgeos::gBuffer function takes as input a spatial object or objects to buffer and a buffer distance.
Let’s assume 1KM is standard walking distance.
bart_1km_buffer <- gBuffer(sfbart_utm, width=1000)
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(bart_1km_buffer) + tm_polygons(col="red") +
tm_shape(sfbart_utm) + tm_dots()
How does this differ when we add byid=TRUE
bart_1km_buffer_byid <- gBuffer(sfbart_utm, width=1000, byid=TRUE)
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(bart_1km_buffer_byid) + tm_polygons(col="red") +
tm_shape(sfbart_utm) + tm_dots()
What operation could we use?
Which buffer polygons?
Why byid=TRUE
What is the output?
homes_near_bart <- gIntersects(bart_1km_buffer, sfhomes15_utm, byid=TRUE)
class(homes_near_bart)
## [1] "matrix"
head(homes_near_bart)
## buffer
## 24 FALSE
## 35 FALSE
## 76 FALSE
## 79 FALSE
## 85 TRUE
## 90 FALSE
# subset
sfhomes15_utm_near_bart <- sfhomes15_utm[as.vector(homes_near_bart),]
tm_shape(bart_1km_buffer) + tm_polygons(col="red") +
tm_shape(sfhomes15_utm_near_bart) + tm_dots()
That was a whirlwind tour of just some of the methods of spatial analysis.
There was a lot we didn’t and can’t cover.
sf package is emerging and will eclipse sp
Raster data is a another major topic! - but the raster package is the key
Introductory Tutorials
Geodata Visualization emphasis
Deep dive Tutorials that include spatial analysis
CRAN Spatial Packages
library(knitr)
purl("r-geospatial-workshop-pt2.Rmd", output = "scripts/r-geospatial-workshop-pt2-May-2018.R", documentation = 2)